Conversation

@electron271

Per https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html, most non-Instinct GPUs support a warp size of 32.

Tested on an RX 9070 XT; looking into getting this tested on AMD Instinct accelerators to ensure that GPUs with a warp size of 64 still work.
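For context, the wavefront ("warp") size on AMD hardware generally follows the gfx architecture family: RDNA generations default to wave32, while GCN/CDNA parts such as the Instinct accelerators use wave64. A hypothetical sketch of that mapping (the helper name and prefix list are illustrative assumptions, not the PR's actual detection code, which queries the device):

```python
# Hypothetical helper (not the PR's actual code): infer the wavefront
# ("warp") size from an AMD gfx architecture string. RDNA generations
# (gfx10xx/gfx11xx/gfx12xx) default to wave32; GCN/CDNA parts,
# including the Instinct line (e.g. gfx942 on MI300X), use wave64.
def guess_warp_size(gcn_arch: str) -> int:
    if gcn_arch.startswith(("gfx10", "gfx11", "gfx12")):
        return 32
    return 64

print(guess_warp_size("gfx1201"))  # RDNA4, e.g. RX 9070 XT -> 32
print(guess_warp_size("gfx942"))   # CDNA3, e.g. Instinct MI300X -> 64
```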

@matthewdouglas (Member)

Thanks for the PR! I don't have the bandwidth to test this personally at the moment, so I'll defer to the AMD team. I also don't have any RDNA GPUs on hand.

cc: @pnunna93

@github-actions

github-actions bot commented Sep 9, 2025

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@pnunna93 (Contributor) left a comment


Thanks for the PR! It's good to go once the warp size change is made.

@matthewdouglas (Member)

Hi @electron271,
There are still a couple of conflicts, mostly because we removed all of the imports related to IPEX. If you don't mind fixing those, I think we can merge after that. Thanks!

matthewdouglas previously approved these changes Oct 3, 2025
@matthewdouglas matthewdouglas added this to the v0.49.0 milestone Oct 3, 2025
@electron271 (Author)

Will look through all this soon; sorry, I've been somewhat busy.

@electron271 electron271 requested a review from pnunna93 October 24, 2025 01:32
@matthewdouglas (Member)

Hi,
It looks like this breaks build compatibility for ROCm 6.1. I would be OK with dropping ROCm 6.1 compatibility if @pnunna93 agrees, but otherwise we would need to fix that build as well.

Apart from that, just a few linting issues to fix.

@pnunna93 (Contributor)

> Hi, It looks like this breaks build compatibility for ROCm 6.1. I would be OK with dropping ROCm 6.1 compatibility if @pnunna93 agrees, but otherwise we would need to fix that build as well.
>
> Apart from that, just a few linting issues to fix.

I agree, we can deprecate ROCm 6.1 compatibility.

@matthewdouglas (Member)

I've opened #1788 which removes the ROCm 6.1 build.

@sstamenk (Contributor)

sstamenk commented Nov 5, 2025

Did some regression testing against the main branch on W7900 (gfx1100), R9700 (gfx1201), and MI300X (gfx942) using the rocm/vllm:latest Docker image. There don't seem to be any regressions. Of the 804 newly enabled tests on gfx1100 and gfx1201, 156 fail due to accuracy issues while the other 648 pass. Attaching some logs:

@matthewdouglas (Member)

Thanks @sstamenk - that's quite useful! The failing tests seem to be mostly gemv with fp32. I think that's OK for now and can be addressed separately.

@electron271 If we fix the lint issues and merge conflict I'm happy to merge this in!

```diff
 import bitsandbytes as bnb
 from bitsandbytes import functional as F
-from bitsandbytes.cextension import HIP_ENVIRONMENT
+from bitsandbytes.cextension import HIP_ENVIRONMENT, ROCM_GPU_ARCH, ROCM_WARP_SIZE_64
```
Contributor


ROCM_GPU_ARCH is unused here and can be removed.



```python
ROCM_GPU_ARCH = get_rocm_gpu_arch()
ROCM_WARP_SIZE_64 = True if get_rocm_warpsize() == 64 else False
```
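As a minor aside on the snippet above: the `True if … else False` ternary is redundant, since the equality comparison already evaluates to a bool. A minimal sketch with a stubbed `get_rocm_warpsize` (the stub is an assumption for illustration; the real helper queries the device):

```python
# Stub for illustration only; the real get_rocm_warpsize() queries the GPU.
def get_rocm_warpsize() -> int:
    return 64

# The comparison itself yields a bool, so the ternary can be dropped:
ROCM_WARP_SIZE_64 = get_rocm_warpsize() == 64
print(ROCM_WARP_SIZE_64)  # -> True with this stub
```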
Contributor


Should we rename ROCM_WARP_SIZE_64 and get_rocm_warpsize() to something generic like WARP_SIZE_64 and get_warpsize(), since they technically cover both the HIP and CUDA cases? That would also make more sense for the unit test skip conditions. @matthewdouglas

